The Rasch item analysis model is supposed to yield norm-free estimates of ability and easiness values, but there are several possible interpretations of the nature and extent of such norm-freeness. One such interpretation involved the scores of one single experimental group of testees, which were embedded in four differently skewed distributions of other scores; the testees had been administered easy and hard sets of items for the verification of person-free and item-free ability estimation. Similarly, for the verification of person- and item-free easiness estimation by the Rasch model, the study involved the formation of four differently skewed sets of items, in terms of their proportion-right easiness values, among which the same single set of experimental items had been embedded. These four sets of items were administered to bright and dumb groups of examinees. The no-guessing and constant-discrimination-power assumptions of the Rasch model were respectively made to be satisfied by using the cloze test blanks as items and by removing those blanks outside a narrow range of discrimination indices. Because of the possibility that the estimation errors may critically depend on the number of ties at each raw score level, making the Rasch estimates of ability and easiness statistically different from one group of examinees or set of items to another at each of these score levels, a linear prediction model was used with the raw pupil scores or item easiness as predictors. (Author)
SOME METHODOLOGICAL CONSIDERATIONS IN THE TESTING OF RASCH MODEL CLAIMS



1. INTRODUCTION 

Ever since Gulliksen's (1950) expression of the need for a response-trait model which would yield norm-free estimates of pupil abilities and item easinesses, there have been numerous attempts by Lazarsfeld, Lord, Rasch, Birnbaum, and others in this direction. Except for the normal-ogive model of Lord, all the others are completely arbitrary mathematical functions purported to represent the binary responses of actual life. The simplest of all is that of Rasch (1966). The essence of his model is epitomized in the expression for the probability of getting an item of easiness $e_j$ right by a testee of ability $a_i$:

\[ P_{ij}(x) = \frac{(a_i e_j)^x}{1 + a_i e_j} \]

with $x$ equal to unity. In the above equation, $x$ is the random variable taking values 0 or 1 according to whether the probability is for the item being scored wrong or right.

Thus, the very basic equation starts with two separate parameters of person ability $a_i$ and item easiness $e_j$, with the assumption that they are independent and can likewise be estimated: easiness without regard to what sample of persons is used and ability without concern about what set of items we have at hand.
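As a minimal sketch of the expression above, the following Python computes the response probability and shows why the model invites item-free comparison of persons: the ratio of two testees' odds of success on the same item equals the ratio of their abilities, whatever the item's easiness. The function names `rasch_probability` and `odds_ratio` are illustrative assumptions, not taken from Rasch's paper or from the item analysis program used in this study.

```python
def rasch_probability(ability: float, easiness: float, x: int = 1) -> float:
    """Return P_ij(x) = (a_i * e_j)**x / (1 + a_i * e_j) for x in {0, 1}."""
    if x not in (0, 1):
        raise ValueError("x must be 0 (wrong) or 1 (right)")
    if ability <= 0 or easiness <= 0:
        # Assumption of this sketch: both parameters are on the positive ratio scale.
        raise ValueError("ability and easiness are assumed positive")
    product = ability * easiness
    return product ** x / (1.0 + product)


def odds_ratio(ability_1: float, ability_2: float, easiness: float) -> float:
    """Ratio of two testees' odds of success on one and the same item."""
    p1 = rasch_probability(ability_1, easiness)
    p2 = rasch_probability(ability_2, easiness)
    return (p1 / (1.0 - p1)) / (p2 / (1.0 - p2))
```

Calling `odds_ratio(4.0, 2.0, e)` with different positive values of `e` should give the same ratio each time, which is the algebraic core of the freeness claims examined below.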
Rasch (1966), in the abstract to his paper, says "An approach to item analysis [...] difficulty of an item and the ability of an individual may sometimes be assessed without reference to the norms provided by some population." What the conditions are to make the norm-free estimation possible is not explicitly mentioned, nor is it indicated what is meant by "some population." The condition seems to be that of using the row and column frequencies of correct responses of the ordered item-person response matrix, as [...] easiness estimation; moreover, Rasch adds in the body of his paper, "...the parameter of the subjects in the subgroups may be evaluated without regard to the parameter of the other subjects; and, of course, it has already been shown these will all be independent of the item parameters. A similar statement holds for the latter." This seems to mean that there are four freenesses of Rasch estimation: person- and item-free ability and easiness estimation.
This last claim has been the one under test by several investigators and is also the subject of the present effort. This endeavor was to call attention to the various inadequacies of the past studies and correct them where it was possible.


2. REVIEW OF LITERATURE

Brooks (1965) investigated the invariance of Rasch easiness estimates (REE's) with respect to large ability variations in the sample of persons initially used to arrive at the easinesses (test of person-free easiness estimation). He used two samples of persons of differing mean ability and evaluated the invariance in terms of an "I-index" obtained by taking the square root of the mean square deviations of the item-points of an empirical plot from the straight line prescribed by the Rasch model. He found that, in general, the item-points followed the theoretical line. His conclusion was based on the visual observation of the closeness of the observed and the theoretical points. The degree of closeness of these points allowing for chance errors (due to sampling, measurement, and estimation procedures) can be evaluated only by statistical inferential methods.

Wright (1968) used a different approach to corroborate the invariance of parameter estimation by the Rasch model. To verify item-free ability estimation, he gave easy and hard halves of a test to a group of Ss and estimated two sets of Rasch abilities for the same Ss from the two sets of items. The purpose of his data analysis was to see how the raw test scores (RTS's) compare with the Rasch ability estimates (RAE's). So, he found the differences, their means and standard deviations, between the two sets of RAE's and the two sets of RTS's. The mean and standard deviation were small for the RAE's and large for the RTS's. This showed that the RAE's are about the same for any group of Ss, no matter what kind of items they are administered. Here, Wright compared the mean of the differences and not the individual differences themselves. Rasch might have meant invariance of RAE's for individual persons, and not for the group as represented by the mean. Moreover, small mean-difference is an artifact of the logarithmic scale of the RAE's.

To verify person-free ability, Wright determined the RAE's and percentile scores of a high and a low ability group, corresponding to each RTS. He then plotted both the pairs of RAE's and percentiles (pairs, because of the low and the high ability groups) against the RTS's. Whereas for the RAE's the plots of the two groups were overlapping, they were not for the percentile-pair. Wright concluded that for any RTS, the Rasch model gave the same estimate of ability.

Anderson et al. (1968) tried to verify item-free easiness, person-free easiness, and person-free ability estimations by the Rasch model. The latter two verifications were performed by correlating the Rasch easinesses and the Rasch abilities from two groups of Ss and finding the correlation to be high (.996 and .992 respectively). It is not known how different the two groups were in the score distribution statistics like mean and the skew. Item-free easiness estimation was tested by correlating the easiness scale values with and without the items fitting the model. This correlation was found to be high again (.999). It is felt that correlation is not a precise measure of agreement between two sets of values even though it may serve to indicate the degree of relationship between the two sets.
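The point that correlation measures relationship rather than agreement can be illustrated with a small sketch. The numbers below are hypothetical and are not data from Anderson et al. or from this study; `pearson_r` and `mean_abs_difference` are illustrative helper names. Two sets of estimates that differ by a shift and a stretch correlate perfectly, yet no single value agrees.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def mean_abs_difference(xs, ys):
    """Average absolute disagreement between paired values."""
    return sum(abs(x - y) for x, y in zip(xs, ys)) / len(xs)


# Hypothetical easiness estimates from one group of Ss ...
group_a = [-1.0, -0.5, 0.0, 0.5, 1.0]
# ... and the same items "estimated" in a second group, shifted and stretched.
group_b = [2.0 * value + 3.0 for value in group_a]

r = pearson_r(group_a, group_b)                    # perfect linear relationship
disagreement = mean_abs_difference(group_a, group_b)  # yet large disagreement
```

A high correlation therefore leaves open whether intercepts and slopes coincide, which is the motivation for testing those quantities directly.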

Brink (1970) verified item-free and person-free ability estimations using simulated data with total scores being of varying standard deviations but satisfying normal distribution. He also used data of varying ranges and of rectangular distributions. He concluded that there were "no systematic differences in fit as well as no differences in values of the ability estimates." What statistical measure he used to gauge the fit is not clear. At any rate, his study shows that there is no influence of standard deviation on the Rasch estimates.

Cypress (1972) found, while examining person-free ability and easiness estimations by the Rasch model, that different skews of the score distribution did affect ability and easiness estimation. The dependent variable to study the effect of person-score-skews on ability estimation was a distance measure called the sum of absolute differences. The differences were those between the RAE's corresponding to the same raw score but for two differently skewed distributions, one with known skew and the other with zero skew. Cypress used the expression given by Wright and Panchapekesan (1968) for the standard error of easiness in the investigation of person-free easiness estimation. In her study, the differently skewed distributions also had widely different means. It is not apparent which caused the observed effect on the Rasch estimates. Furthermore, comparing merely the raw sum of absolute differences does not take into consideration the sampling, estimation and other chance errors.

Tinsley and Dawis (1973) used correlational methods to determine whether RAE's were more invariant than percentiles and raw scores when they were estimated from tests that were different in item easiness (test of item-free Rasch ability estimation). They computed the correlation between each pair of the three types of scores, obtained from hard and easy parts of four kinds of analogy subtests (word, symbol, number, and picture). All the three types of scores were found to be item-dependent. The investigators attributed this to the failure of the tests to meet the assumptions of the Rasch theoretical model. To quote them, "it is illogical to assume that tests which do not fit the Rasch model will still have the characteristics attributed to it (Tinsley & Dawis, 1973)."

3. CRITIQUE OF LITERATURE

Some apparent flaws or inadequacies of the foregoing studies are listed below:

a) While Rasch meant independence or invariance of estimation in the four respects of person-free ability, person-free easiness, item-free ability and finally item-free easiness, none of the investigators studied all of them.

b) In spite of the knowledge that the Rasch model assumptions were not satisfied in their data, the investigators proceeded with their studies of verifying Rasch model claims. The assumptions of no-guessing and of constant-discrimination-power in the items of the test used seem to be the major ones violated. That of the unidimensionality of the test might have been satisfied only in the analogy tests employed by Tinsley and Dawis (1973). Unfortunately, the meeting of the assumption of local independence of the item-responses cannot be evaluated.

c) In any verification of person-free ability or item-free easiness estimation by the Rasch model, the sample size of persons or items and, at least, the first three moments (mean, sd, and the skew) of the parameter distribution have to be considered. When the effect of one of these is being studied, the rest of them should be kept constant. None of the investigators took this into consideration (a generalised normal function is appropriate for this purpose).

d) Errors of measurement, estimation and sampling can be allowed for only in the framework of statistical inference. Such a route does not appear to have been thought of by any of the above-referenced workers.



Chi-square tests are less suitable than the parametric F-tests, since the Rasch measures are supposed to be ratio scale.

While it is true that Rasch's claims could be interpreted and tested in many ways, the "method of embedding" for the verifications of person-free ability and item-free easiness, and the usual "method of variant groups" or sets for the (cross) verifications of person-free easiness and item-free ability, seemed to be useful to bring out the different nature of the issues involved. It would be edifying at this point to state the purpose of the present study explicitly and our interpretation of Rasch's claims, when these methods will also be outlined.

4. PURPOSE OF THE STUDY

It is the purpose of this study to test through statistical inferential analysis the claims of the Rasch model estimation in the four aspects of person- and item-free ability and easiness estimation.

a,b) Person- and item-free Rasch ability estimation: For these cases, the Rasch claims are interpreted to mean that a testee or a group of testees whose scores are embedded in a series of differently skewed distributions of scores should have the same Rasch ability estimates from any of the embedding or host distributions. This defines the case of person-free ability estimation. For item-free ability estimation, the Rasch ability estimates of the experimental Ss (who are common to all the embedding or host distributions) should be the same from the administration of either hard or easy item-sets. In effect, the object here is to study the effect of skew of the embedding subject-score distribution and that of the item-set easiness on the Rasch ability estimation.

c,d) Person- and item-free Rasch easiness estimation: The easiness estimates from the Rasch model of a given item-set should be the same whether they are estimated from low or high ability Ss. Item-free easiness estimation by the Rasch model might be said to be true if a particular set of items common to several distributions in easiness values have their Rasch easiness estimates unaffected by their presence in these host distributions. In summary, our aim here is to study the effect on Rasch easiness estimates of the skew of the embedding easiness distribution and of the ability of the subject-group used to arrive at the estimates.

5. METHOD OF ANALYSIS 
a, b, c, d) i) Data: The subjects were 226 pupils in the 4th, 5th, and 6th grades of three different schools. There were approximately equal numbers of girls and boys in the sample. These pupils had taken five cloze tests* in social studies, out of which one of moderate difficulty was chosen as the "experimental cloze test" (ECT), as this was going to be used for the primary purpose of estimating Rasch scores and in the subsequent inferential analysis. The other four were called the auxiliary cloze tests (ACT's) since these were to be used for the secondary purpose of creating the various skewed distributions. The ECT and the ACT's all had about 250 words and about 50 blanks. The blanks of the ECT were considered as "items" and the free responses to them were scored zero or one for wrong or right restoration of the deleted word.

Since the Rasch model estimates can be expected to follow Rasch's claims only if the model assumptions are met, the assumption of equal discrimination was forced by discarding items having a discrimination index outside a small range of 0.15 from the ones with the maximum set of observed values. The allowed range of 0.15 was taken as a reasonably small spread, in a compromise to maintain the constant discrimination power for all the items demanded by the model. As far as the assumption of guessing being small goes, this can be expected to be fairly well satisfied in the case of free responses, since the number of alternatives or choices or options per item can be taken as extremely large. The reciprocal of this number, which is usually taken as the probability of chance response to an item, will be small. The assumption of unifactoralness can be expected to be also satisfied in cloze test responses. Thus, once it had been seen to that the assumptions of the Rasch model were all fairly well met, the claims were ready to be tested.

* Cloze tests are a kind of reading comprehension test wherein every n'th word is systematically deleted from a passage and a blank of standard length is left in its place; pupils fill up the blanks using the context of the words on both sides of the blanks. The percent score of a pupil over the number of blanks on a randomly selected passage of a book can be taken as a reasonable measure of reading ability of the pupil with respect to that book.

The scores on the four auxiliary cloze tests (ACT's) were dichotomized on a criterion score.* Their totals, ranging from 0 to 4, were used as premeasures to create four distributions of four different skew magnitudes but of similar mean, standard deviation, and total sample size. The parent distribution had a total of 226 Ss. The four host** distributions were created by calculating the numbers (frequencies) to go in each of these five score groups so as to give the same reduced total of 189 Ss and a constant mean. The calculated frequencies were randomly drawn afresh 4 times from within each of the five score groups of the parent distribution. In order to alter the skew magnitude, the trend in the frequency distribution of ACT scores that would change the skew in a certain direction was noted, and then either the low or the high scores, as was noted, were removed or added randomly to the score-group of the host distribution under manipulation.

* The ACT's were used as premeasures to create the differently skewed distributions. It is customary in the literature of cloze test studies to use the 36% score as the criterion to denote the demarcation between those who can "read" the passage and those who cannot. This was the criterion used here too. But such binary scores are not in any way superior to the raw scores themselves, apart from being convenient for calculating the means in the trial-and-error creation of the four skew distributions.

** The qualifier "host" is used here to indicate that these distributions housed the experimental Ss who were common to all of them.
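The bookkeeping behind such a trial-and-error construction can be sketched as follows: for a candidate set of frequencies over the five premeasure totals (0 to 4), compute the sample size, mean, standard deviation and skew, and adjust until size, mean and standard deviation match while the skew differs. The frequencies below are hypothetical, chosen only so that they sum to 189; they are not the frequencies used in the study, and `frequency_moments` is an illustrative name.

```python
from math import sqrt


def frequency_moments(frequencies):
    """Return (n, mean, sd, skew) for frequencies[k] = number of Ss with total k."""
    n = sum(frequencies)
    mean = sum(k * f for k, f in enumerate(frequencies)) / n
    m2 = sum(f * (k - mean) ** 2 for k, f in enumerate(frequencies)) / n
    m3 = sum(f * (k - mean) ** 3 for k, f in enumerate(frequencies)) / n
    sd = sqrt(m2)
    skew = m3 / m2 ** 1.5  # moment coefficient of skewness
    return n, mean, sd, skew


# Hypothetical host distributions over the score groups 0, 1, 2, 3, 4.
symmetric_host = [20, 40, 69, 40, 20]
positive_host = [10, 70, 39, 50, 20]
negative_host = list(reversed(positive_host))

for host in (symmetric_host, positive_host, negative_host):
    print(frequency_moments(host))
```

In this illustration all three hosts share the same size, mean and standard deviation, so only the third moment distinguishes them, which is the kind of control the embedding design aims for.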

a,b) ii) Procedure to test person-free and item-free ability estimation by the Rasch model. Of the total 51 blanks or items, 11 were discarded to satisfy the constant-discrimination-power assumption demanded by the Rasch model. The proportion-right easiness values were ranked and divided into three thirds to give three item-sets of hard, medium and easy items. The Ss in the 4 skew groups (host distributions) were "administered" all the three sets of items. Due to the particular manner of the construction of the 4 skew groups, there happened to be 26 Ss who belonged to all the 4 skew groups and had nonzero and nonperfect scores in all the three item-sets. These formed the "experimental Ss" who yielded repeated measures on the pseudo-factors of skew and easiness (below).

The Rasch ability estimates were the dependent variable(s). The correlation between these RAE's and the ECT raw score totals was controlled by making the latter a predictor in a linear prediction model. The correlation among the dependent variable(s) for the 12 conditions of three easiness levels and four skew levels was taken into account by making these 12 sets the dependent variables, with the assumption that they obey a 12-variate normal distribution with a common variance-covariance matrix (tables 6 a,b).

The design is, thus, a multivariate linear prediction model as represented by the scalar equation:

\[ Y_{ij} = \alpha_i + \beta_i X_j + \epsilon_{ij} \]

or the vector equation:

\[ \mathbf{Y}_j = \boldsymbol{\alpha} + \boldsymbol{\beta} X_j + \boldsymbol{\epsilon}_j \]

Here,

$Y_{ij}$ = the dependent variable (Rasch ability estimate in the logarithmic scale)

$i$ = 1, 2, ..., 12; index for the 12 conditions of 4 skews and 3 easiness levels

$j$ = 1, 2, ..., 35 (maximum reached on the 50-blank ECT); index for raw total scores on the ECT

$X_j$ = the predictor; the ECT raw total score, assumed to be error-free

$\alpha_i$ = the population value of the intercept-like parameter; determines $Y_{ij}$ at the zero value of $X_j$

$\beta_i$ = the population value of the slope-like parameter (direction number); determines the rate of variation of Y for unit variation in X

$\epsilon_{ij}$ = population value of residuals; deviation of the predicted $Y_{ij}$ from the observed $Y_{ij}$; assumed to be normally and independently distributed.
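As a rough illustration of the scalar form of this model, the sketch below fits an ordinary least-squares intercept and slope separately for each condition and leaves the comparison of the fitted lines to a later step. It is only a per-condition fit under stated assumptions: it does not reproduce the multivariate tests of the "Mulgen" program, the condition labels and data are hypothetical, and `fit_line` and `fit_conditions` are illustrative names.

```python
def fit_line(xs, ys):
    """Least-squares (intercept, slope) for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("the predictor must take at least two distinct values")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def fit_conditions(raw_scores, estimates_by_condition):
    """Fit one prediction line per condition (for example, skew x easiness)."""
    return {
        condition: fit_line(raw_scores, estimates)
        for condition, estimates in estimates_by_condition.items()
    }


# Hypothetical raw totals and ability estimates for two of the 12 conditions.
raw_scores = [1, 2, 3, 4, 5]
estimates = {
    "s1e1": [0.5 + 0.1 * x for x in raw_scores],
    "s1e2": [0.9 + 0.1 * x for x in raw_scores],  # same slope, shifted intercept
}
lines = fit_conditions(raw_scores, estimates)
```

In this made-up example the two lines are parallel but not collinear, the pattern that would correspond to rejecting the intercept hypothesis while retaining the slope hypothesis.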

This linear prediction model would permit the testing of equality of RAE's for the individual rather than the over-all mean by dint of the use of the concept of "fit" of the regression lines. The null and the research hypotheses are that the intercepts and the slopes should be the same for the experimental Ss, irrespective of what skew host they belong to and irrespective of how hard or easy the items are that are administered to them. In essence, the hypotheses are meant to test whether the prediction lines are collinear. If there is noncollinearity owing to errors of measurement and estimation, the effect size of the departure in the population should be minute. Obviously, a large sample size should be used for this kind of goodness-of-fit-like hypothesis. In other words, consideration of power or beta (type II) error is very relevant here. High power or low beta error should be aimed at. As far as type I error goes, we can set the alpha probability high (say, 0.4) to give the data at hand every chance of rejecting the proposed null, even though we are really interested in retaining it or failing to reject it. The multiple hypotheses involved (in testing separately for the equality of intercepts and the slopes and then these, separately within each skew level for all the easiness levels and also within each easiness level for all the skew levels) should automatically increase the alpha error for the whole experiment, even if that for the individual hypothesis is set low. The multivariate design is illustrated in figure 1.
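How quickly the experiment-wide alpha grows with the number of hypotheses can be gauged with the usual formula 1 - (1 - alpha)^k. The sketch below assumes k independent tests, which is an assumption of the illustration only: the tests in this study are based on correlated dependent variables, so the figure is a rough guide rather than the actual error rate. The function name `familywise_alpha` is illustrative.

```python
def familywise_alpha(alpha_per_test: float, n_tests: int) -> float:
    """Probability of at least one type I error among n independent tests."""
    if not 0.0 <= alpha_per_test <= 1.0:
        raise ValueError("alpha_per_test must lie between 0 and 1")
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return 1.0 - (1.0 - alpha_per_test) ** n_tests


# With alpha = .05 per test, ten independent tests give roughly a .40 chance
# of at least one false rejection somewhere in the experiment.
overall = familywise_alpha(0.05, 10)
```

This is why a low alpha for each individual hypothesis can still amount to a fairly liberal alpha for the experiment as a whole.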





[Figure 1 not reproduced; surviving axis label: "binary totals".]

Fig. 1: Schematic diagram showing the linear prediction model for 12 dependent variables of 3 easiness levels and 4 skew levels; illustration for one score group of the auxiliary cloze tests' dichotomized score totals.

Legend: S1, S2, ..., S4 indicate the four skew levels; E, M, & H denote the three easiness levels.



The null or the research hypotheses* are symbolically represented in the following equations: for any one score group of the ACT (see fig. 1),**

* In principle, a third hypothesis should also be tested on the equality of correlated variance (residual or predicted) or, equivalently, the equality of tightness of fit; all the three hypotheses together will go to test for the collinearity of the prediction lines. The pertinent test for the last of these (to test the equality of correlated variances) is given by Anderson (195?) in the form of a hypothesis equating two variance-covariance matrices.



** If there is any reason to believe that the prediction lines are of equal intercept and slope across all the five score groups of the auxiliary cloze tests, then the appropriate tests are the likelihood ratio tests given by Gulliksen & Wilks (1950). This kind of different test is dictated for the reason that the assumption of random sampling does not hold good across these score groups, which are merely different sections of the same bivariate distribution of the predictor and the predictand in the prediction system.
First, we test whether the intercepts are equal:

\[ \alpha_{s_1 e_1} = \alpha_{s_1 e_2} = \cdots = \alpha_{s_4 e_3} \]

Second, we test whether the corresponding slopes are equal:

\[ \beta_{s_1 e_1} = \beta_{s_1 e_2} = \cdots = \beta_{s_4 e_3} \]

in the population. Here, the "s" subscripts denote the skew levels and the "e" ones the easiness levels.

To get the 12 sets of the Rasch ability estimates, 12 runs of the Rasch item analysis program (Wright & Panchapekesan, 1968) were made, administering the items in the three item-sets to the Ss in the four skew distributions. As it was thought that the degrees of freedom for the denominator in the F-tests would be larger if only 8 dependent variables were used, the number of easiness levels was decreased from 3 to 2 for the important tests of person- and item-free ability estimation.

The Fortran program called "Mulgen" was used for the tests of contrast- and the over-all-hypotheses. Only the experimental Ss in the zero group of ACT were used for the analysis. The latter was done on these 26 Ss common to the skew groups and with nonzero and nonperfect scores on the 3 item-sets.
c,d) Procedure to test person-free and item-free easiness estimation by the Rasch model: The procedure here was somewhat similar to that in testing Rasch ability estimation. Of the 30 items in the parent distribution, four host distributions were created, each containing 15 items. By trial-and-error, using the proportion-correct easiness values as the measures for consideration, these 4 distributions or "skew-sets" were constructed (table 1c). Five items common to these skew-sets formed the "experimental items". Three ability groups were created by ranking the ECT total scores and taking the three thirds (one of the groups had 76 Ss while the other two had 75 Ss each, making the total of 226 Ss). Rasch easiness estimates* were the dependent variables for this part of the study. These estimates under 4 different skew conditions enabled the testing of item-free (skew-free) easiness estimation by the Rasch model. The same estimates measured under three ability conditions made possible the testing of person-free Rasch easiness estimations. The twelve sets of Rasch easiness were obtained by twelve runs of the Rasch item analysis program as prepared by Wright & Panchapekesan (1968). Again, the Fortran program, meant for testing of multivariate general linear hypotheses, was used in the testing of equality of intercepts and slopes of the prediction lines.

* The correlations among these Rasch easiness estimates are in table 7.
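The formation of the three ability groups by ranking total scores can be sketched as below. The source states only that one group had 76 Ss and the other two 75 each; which group received the extra subject, and how tied scores at the cut points were handled, are not stated, so this sketch assumes the lowest group takes the extra subject and that ties are broken by original order. `split_into_thirds` is an illustrative name and the scores are hypothetical.

```python
def split_into_thirds(scores):
    """Rank subjects by score and return index lists for low, middle, high thirds."""
    order = sorted(range(len(scores)), key=lambda index: scores[index])
    base, extra = divmod(len(order), 3)
    # Assumption: any remainder is given to the lowest groups first.
    sizes = [base + (1 if group < extra else 0) for group in range(3)]
    groups, start = [], 0
    for size in sizes:
        groups.append(order[start:start + size])
        start += size
    return groups


# Hypothetical ECT totals for 226 Ss (values cycle through 0..35).
hypothetical_scores = [(7 * subject) % 36 for subject in range(226)]
low, middle, high = split_into_thirds(hypothetical_scores)
```

With 226 subjects this yields groups of 76, 75 and 75, matching the group sizes reported above.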

6. RESULTS 

Irrespective of the total number of individual contrast hypotheses tested, the type I error rate for these individual tests was set at the .05 level (conclusions will not change even if it is set at the .01 level, excepting that there is no tabled F for the degrees of freedom, 3 and 1, at the .01 level for the tests of Rasch easiness estimation). Assuming our criterion of collinear lines does indeed go to prove Rasch's claims, we would expect failure to reject the nulls in all the individual contrast-tests, if his claims were true.

Now, going to table 2, where the case of person-free ability estimation is reported, we find that rejection of the null is in order. Within each level of the item-sets, the Rasch ability estimates are not statistically the same across the skew-groups. In other words, skew does have an influence on the Rasch ability estimation; or, the Rasch model is not skew-free in the aspect of ability estimation.

Looking at table 3, where item-free ability estimation by the Rasch model is tested, we see that the slope and the intercept hypotheses are not rejected while the over-all hypothesis is. Since these two types (individual and the over-all) of tests have different degrees of freedom, they cannot be compared. With reservations, it might be said that Rasch abilities are free of what items are employed to arrive at them.

The fact that at least one set of hypotheses for the tests of ability 
estimation was rejected (as in table 2), is reflected in the over-all test 
of both person- and item-free ability estimation, as in table 2'-3'. 

Table 4 gives the tests for the case of person-free easiness estimation. Noting that the degrees of freedom are rather small, especially for the denominator of the F-statistic, we find that the data used do not support the null, that the Rasch easiness estimates are unaffected by the diversity of person scores employed.

In table 5 are reported the tests of item-free easiness estimation by the Rasch model. The intercepts are statistically different while the slopes are not (at the level of significance chosen). Hence these parallel lines may be interpreted to mean that there is the effect of an additive constant in this case. By our criterion of collinear lines, Rasch easiness estimates are not item-free or skew-free.

To summarize, we conclude that item-free ability estimation by the Rasch model alone may have been supported by the data used in this study.

While the estimates of intercepts and the slopes appear to be practically about the same, they are not statistically the same in most cases. They need to be closer in magnitudes in the light of high correlations among the dependent variables (namely, the Rasch estimates for the conditions of person-score-skew, item-easiness-skew, the ability, and the easiness levels). But taking into account these correlations probably puts too great a constraint on the verifications.



7. LIMITATIONS OF THE STUDY

a) The sample size is small for the Rasch model verifications and this can be considered serious only in those tests where there was failure to reject the null. This is to ensure or ascertain that the true null is retained, in these cases, even though both the null and the research hypotheses were the same (like in the goodness-of-fit tests) in all verifications.

b) There is, of course, the logical axiom that one can only disprove any proposition and not prove it. On this account, the very basis of this study is shaky. More proper would have been a mathematical proof of the invariance of the Rasch estimates with respect to the considered manipulated variables using one or more of the invariance theorems.

c) The creation of the various distributions of the different skews from 
one parent distribution can produce the observed skew-differences merely by 
the effect of random sampling; the different skews that were produced from 
one parent distribution were solely an artifact of the sampling fluctuations 
and for this reason, the skew magnitudes were close to one another. This
problem could have been circumvented by generating artificial data to create 
several parent distributions which themselves could be used to host the 
experimental Ss. 

d) The manner of creation of the host distributions of several skew magnitudes might be questionable. The trial-and-error method of manipulating the score distributions might be considered as quasi-random and not completely random. This might have affected the assumption of the independent chi-squares in the formation of the F-statistic.

e) In retrospective reflection, it is thought that it was not necessary to have attempted to keep the first two moments (namely, the mean and the standard deviation) of the parameter distributions constant. Any combination of these might be varied as would be true in real-life norm distributions. Studying the effect of the variation of the skew of the distributions alone or their mean alone would be only of theoretical or academic interest.
f) A plot of the Rasch abilities against the raw scores will be a logistic curve. But in this study, straight lines were fitted and may not have been appropriate.

g) In the ability verifications, the predictor variable for the fit
of the straight lines was the total on the 30-item experimental cloze
test, even though the Rasch abilities had been derived from the
respective subtest scores (easy, medium and hard subtests). These subtest
totals would have been better predictors. In fact, going to table 6b
and looking at the middle two columns, we see that the Pearson
correlation between the medium subtest scores and the corresponding Rasch
ability estimates is more or less uniformly .99 and that between the
hard subtest scores and their corresponding Rasch abilities is very
uniformly .98. But the correlations between the whole test scores and
the medium-subtest Rasch abilities or the hard-subtest Rasch abilities
are lower (nearly .79 and .59 respectively). The Appendix reports an
approximate repeat of all the previous inferential analyses, using
these subtest scores as predictors. It will be seen that our
conclusion on the possibility of item-free Rasch ability estimation was
nullified.
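
Limitation (f) can be illustrated with a small sketch. It assumes, purely for illustration, a 10-item subtest whose items are all equally difficult, in which case the Rasch ability for a raw score r reduces (up to an additive constant) to the log-odds log(r / (n - r)). The function names and the equal-difficulty assumption are not from the study. Fitting a straight line to these values leaves a systematic S-shaped pattern of residuals, which is what one would expect if the true relation is logistic.

```python
import math

def logit_ability(raw_score, n_items):
    """Log-odds of the raw score; assumes all items are equally difficult."""
    return math.log(raw_score / (n_items - raw_score))

def fit_line(xs, ys):
    """Ordinary least-squares intercept and slope for y regressed on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

N_ITEMS = 10                       # subtest length, as in the study
scores = list(range(1, N_ITEMS))   # zero and perfect scores have no finite estimate
abilities = [logit_ability(r, N_ITEMS) for r in scores]
intercept, slope = fit_line(scores, abilities)
residuals = [a - (intercept + slope * r) for r, a in zip(scores, abilities)]

for r, a, e in zip(scores, abilities, residuals):
    print(f"score {r}: ability {a:+.3f}, residual from line {e:+.3f}")
```

The residuals are negative at the lowest score, positive in the lower-middle range, negative in the upper-middle range and positive at the highest score, so the misfit of a straight line is systematic rather than random.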

In view of the above limitations, the results of this study can, at 
best, be regarded as tentative and of restricted value. It is regrettable 
that even this elaborate effort could not pronounce the last word on the 
issue of the veracity of the Rasch model claims. 

APPENDIX

In the Rasch ability verifications, we had used the 30-item scores
as the predictors instead of the 10-item subtest scores. In view of the
higher correlations of the Rasch abilities with the latter, the predictive
power (predictable variance) will be increased if these subtest scores
are used as predictors. Moreover, subtest scores are the more legitimate
predictors, because the Rasch abilities were, in reality, derived
from them.
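
The gain in predictable variance can be worked out from the approximate correlations quoted in the text (table 6b). The sketch below simply squares each correlation; the dictionary labels and the function name are illustrative and not part of the original analysis.

```python
def predictable_variance(r):
    """Proportion of criterion variance accounted for by a linear predictor."""
    return r ** 2

# Approximate correlations quoted in the text (table 6b).
correlations = {
    "medium subtest score vs medium-subtest Rasch ability": 0.99,
    "whole-test score vs medium-subtest Rasch ability": 0.79,
    "hard subtest score vs hard-subtest Rasch ability": 0.98,
    "whole-test score vs hard-subtest Rasch ability": 0.59,
}

shares = {label: predictable_variance(r) for label, r in correlations.items()}
for label, share in shares.items():
    print(f"{label}: r^2 = {share:.2f}")
```

On these figures the subtest scores account for roughly 98 and 96 percent of the variance in the corresponding Rasch abilities, against roughly 62 and 35 percent for the whole-test scores.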

Therefore, a reanalysis was made of the inferences on the equality of
"intercepts" alone at various specific values of the predictor. As
expected, the error variances dropped dramatically (to the range .[?]2-.56) from
what they were (range 2-12) when the whole-test scores were the predictor.
The "intercepts", slopes and the F-statistic for the test of the equality
of these intercepts (actually, the predicted means of the conditional
distributions of the Rasch ability at specified predictor values) are given
in tables 2', 3' and 2"-3". It is seen that the null hypotheses of
equal predicted-mean Rasch abilities are contradicted by the data in all
cases of the 4 skew levels within the 2 easiness levels and of the 2
easiness levels within each of the 4 skew levels. Going back to table 3, the
rejection of the null in the over-all tests seems to be supported by this
reanalysis. Thus, Rasch's claims are contradicted in all the four
respects of person- and item-free ability and easiness estimations. But a
word of caution. The high and sometimes inordinately high F values are due
to the high and sometimes perfect correlations among the Rasch estimates
(as in the bottom line of table 6c) and between the latter and the predictors.
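
The caution about inflated F values can be made concrete with a standard regression identity, offered here as an illustrative sketch and not as the computation actually carried out in the study. The symbols n, r, s_y^2 and s_{y.x}^2 are notation introduced for this illustration. For a straight-line regression of a criterion y on a predictor x over n cases with correlation r, the residual (error) variance is:

```latex
s_{y \cdot x}^{2} \;=\; \frac{n-1}{n-2}\,\bigl(1 - r^{2}\bigr)\, s_{y}^{2}
```

Any F ratio that carries this error variance in its denominator grows without bound as r approaches 1. With r = .99 the factor (1 - r^2) is about .02, against about .38 with r = .79, so the same difference among predicted means yields an F nearly twenty times larger.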

Table 1a. Distribution statistics for the testing of person-free (skew-free) ability
estimation (auxiliary cloze test, ACT)

statistic            skew group-1   skew group-2   skew group-3   skew group-4

mean                 1.08           1.03           1.07           1.2[?]
standard deviation   1.14           1.09           1.21           1.23
skew                 .71            .80            [illegible]    [illegible]
kurtosis             -.39           [illegible]    [?].12         -.92


Table 1b. Frequency distribution in the four skew-groups (26 Ss common to all the 4
skew-groups)

total binary     skew-group-1   skew-group-2   skew-group-3   skew-group-4
score in ACT

0                82             7[?]           82             82
1                36             50             50             22
[2]              52             29             52             [illegible]
3                12             [illegible]    [illegible]    22
[4]              7              [illegible]    11             11
total            189            189            18[?]          189


Table 1c. Distribution statistics for the testing of item-free (skew-free) easiness
estimation (raw easiness in units of proportion correct) (5 items common to all the
4 skew-sets)

statistic            skew-set-1   skew-set-2   skew-set-3   skew-set-4

[Rows: mean (label illegible), standard deviation, skew, kurtosis. Cell positions are
not recoverable from the source scan. Legible entries, in scan order: .3[?], .17,
-.49; .40; .37, .20, -.99; .37, .17, .17, -.95.]

Table 1b'. Rasch ability estimates for the 4 skew levels and three item-sets (easy,
medium and hard, each of ten items). Zero and perfect scores are omitted in the
Rasch estimation procedure.

raw      easy item-set          medium item-set        hard item-set
score    log       antilog      log       antilog      log       antilog

skew-group-1
1        -2.356    .095         -2.337    .097         -2.288    .102
2        -1.495    .224         -1.483    .227         -1.453    .234
3        -0.911    .402         -0.905    .405         -0.891    .410
4        -0.429    .651         -0.426    .653         -0.426    .653
5         0.014    1.014        [illegible] [illegible]  0.002    1.002
6         0.453    1.573         0.448    1.565         0.430    1.537
7         0.926    2.524         0.917    2.502         0.893    2.442
8         1.493    4.452         1.481    4.396         1.453    4.275
9         2.330    10.279        2.314    10.117        2.284    9.815

skew-group-2
1        -2.336    .097         -2.355    .095         -2.323    .098
2        -1.483    .227         -1.496    .224         -1.482    .227
3        -0.905    .405         -0.912    .402         -0.912    .402
4        -0.427    .653         -0.427    .651         -0.439    .645
5         0.011    1.011         0.014    1.015         0.001    .999
6         0.447    1.563         0.454    1.575         0.437    1.547
7         0.916    2.500         0.927    2.526         0.911    2.487
8         1.481    4.397         1.493    4.451         1.482    4.403
9         2.315    10.129        2.329    10.268        2.325    10.231

skew-group-3
1        -2.336    .097         -2.367    .0938        -2.281    .102
2        -1.485    .226         -1.504    .222         -1.449    .235
3        -0.909    .413         -0.916    .400         -0.889    .411
4        -0.431    .650         -0.430    .651         -0.426    .653
5         0.008    1.008         0.016    1.016         0.001    1.001
6         0.445    1.561         0.458    1.581         0.427    1.532
7         0.917    2.502         0.932    2.540         0.889    2.433
8         1.484    4.410         1.500    4.483         1.449    4.258
9         2.321    10.186        2.337    10.352        2.280    9.776

skew-group-4
1        -2.332    .097         -2.348    .096         -2.318    .099
2        -1.482    .227         -1.492    .225         -1.478    .228
3        -0.905    .404         -0.910    .402         -0.909    .403
4        -0.429    .651         -0.429    .651         -0.436    .646
5         0.009    1.000         0.012    1.012         0.000    1.000
6         0.445    1.561         0.450    1.569         0.436    1.546
7         0.915    2.497         0.922    2.516         0.908    2.480
8         1.480    4.393         1.489    4.434         1.478    4.383
9         2.315    10.125        2.326    10.236        2.318    10.160
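
The "log" and "antilog" columns of Table 1b' appear to be related by the natural exponential, that is, antilog = exp(log). The short check below, a sketch rather than part of the original analysis, recomputes the antilog for four skew-group-4 entries of the easy item-set and compares it with the tabled value; the variable names are illustrative.

```python
import math

# (log ability, tabled antilog) pairs taken from the skew-group-4 easy item-set rows
pairs = [(-2.332, 0.097), (-1.482, 0.227), (0.445, 1.561), (2.315, 10.125)]

discrepancies = [abs(math.exp(log_value) - antilog) for log_value, antilog in pairs]

for (log_value, antilog), gap in zip(pairs, discrepancies):
    print(f"log {log_value:+.3f}: exp = {math.exp(log_value):.3f}, "
          f"tabled antilog = {antilog}, difference = {gap:.4f}")
```

All four differences fall within rounding error of the three-decimal entries, which supports reading the antilog column as the ability on the ratio (odds) scale.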

Table 2. Slopes, intercepts and observed & critical F values: case of person-free
ability estimation by the Rasch model across 4 skew levels in ability distribution:
4 dependent variables of Rasch ability estimates (RAE's): (the predictors were the
totals in the 30-item ECT, the sum of easy, medium and hard subtest scores):

Rows: intercept and slope for the medium set; intercept and slope for the hard set.
Columns: skew group-1, skew group-2, skew group-3, skew group-4, observed F,
critical F.

[The positions of the intercept and slope cells are not recoverable from the source
scan. Legible entries, in scan order: -3.1, .18, -3.1, .1[?], -3.0, .17, -3.2, .19,
-3.2, -3.1, .1[?], .10, -3.2, .19, -3.2.]

Observed F, in scan order: 22.4; (13.[?] (6,44)); 18.0; (33.6* (6,[?])); 5.7.
Critical F: 1.5 (75%), 2.4 (90%), 3.7 (95%), 4.2 (99%).

* significant at .05 level

Table 3. Slopes, intercepts, observed & critical F values: case of item-free ability
estimation by the Rasch model across 2 easiness levels: 2 dependent variables of Rasch
ability estimates (RAE's): (the predictors or X's were the totals in the 30-item ECT; that
is, the sum of easy, medium and hard subtests):

               estimates of   medium-easiness   low-easiness (hard)
                              item set          item set

skew group-1   intercept      -3.1              -3.1
               slope          .18               .1[?]
skew group-2   intercept      -3.0              -3.[?]
               slope          .17               .10
skew group-3   intercept      -3.2              -3.2
               slope          .19               .10
skew group-4   intercept      -3.2              [illegible]
               slope          .19               [illegible]

Observed F, in scan order: 1.6; 0.02; [?]0.1 (2,24)*; 2.6; [?]0.7 (2,2[?])*; 2.[?].
Critical F(1,24): 1.4 (75%), 2.9 (90%), 4.3 (95%), 7.8 (99%).

* significant at .05 level

Table [number illegible]. Slopes, intercepts and observed & critical F values: case of
person- [remainder of caption illegible]

estimates of   skew-1         skew-2         skew-3         skew-4         observed   critical
               med.   hard    med.   hard    med.   hard    med.   hard    F          F(7,1[?])

intercept      -3.2   -3.1    -3.1   -3.1    -3.2   -3.1    -3.2   -3.1    27.6       [illegible]
slope          .18    .1[?]   .17    .11     .18    .11     .18    .11     12.8       [illegible]

* significant at .05 level

Table 4. Slopes, intercepts and observed & critical F values: case of person-free
[remainder of caption illegible]

Columns: estimate for, low ability, medium ability, high ability, observed F,
critical F(3,1). Row labels are illegible; the four intercept and slope pairs
appear in the order below.

estimate for   low ability   observed F

intercept      -2.3          [?]051.2
                             (49.6 (6,2))*
slope          .07           3526.4

intercept      -2.7          5016.4
                             (59.9 ")*
slope          .07           [illegible]

intercept      -2.2          3908.5
                             ([?]2.7 ")
slope          .07           3836.4

intercept      -2.4          3986.9
                             ([?]7.6 ")*
slope          .07           3825.9

[Cell positions in the medium ability and high ability columns are not recoverable
from the source scan. Legible entries, in scan order: -1.8, .05; -2.1, .05, .05;
-1.9, .05; -1.9, .05.]

Critical F(3,1): 8.2 (75%), 53.6 (90%).

* significant at .05 level



Table [number illegible]. Slopes, intercepts and observed & critical F values: case of
item-free easiness estimation by the Rasch model across 4 skew values in easiness
distribution:

group         estimate for   skew          skew     skew     skew          observed F
                             set-1         set-2    set-3    set-4

[illegible]   intercept      [illegible]   -2.7     -2.2     -2.4          > 2000
              slope          .07           .07      .07      [illegible]   [?] 2000

[illegible]   intercept      -1.9          -2.2     -1.8     -1.8          > 2000
                                                                           (> 2000)
              slope          .05           .05      .05      [illegible]   7.2

[illegible]   intercept      [illegible]   -2.0     -1.9     -1.9          > 2000
                                                                           (> 2000)*
              slope          .05           .05      .05      .05           .73

Critical F(3,1): 8.2 (75%); [remaining entries illegible].

* significant at .05 level

Table 2'. Slopes, intercepts and observed & critical F values: case of person-free
ability estimation by the Rasch model across 4 skew levels in ability distribution:
[count illegible] dependent variables of Rasch ability estimates (RAE's): (X = X1 or X2
according to whether medium or hard subtest is used; X1 = medium subtest scores;
X2 = hard subtest scores)

Rows: intercept and slope for the medium set; intercept and slope for the hard set.
Columns: skew group-1 to skew group-4, observed F, critical F(3,22).

[The intercept and slope cells are illegible in the source scan.]

Observed F, in scan order:
   2456.1 (X=2)
   36.2 (X=5)
   429.6 (X=8)

   > 10^[exponent illegible] (X=2) §
   > 10^[exponent illegible] (X=5) §
   > 10^[exponent illegible] (X=8) §

Critical F(3,22): 4.2 ([percentile illegible])

§ Inordinately high F's are due to the perfect correlations among the Rasch
estimates of ability (see bottom line of table 6c).

Table 3'. [Beginning of caption illegible] (X1 and X2 are the predictors; X1 = medium
subtest scores; X2 = hard subtest scores) (inferential test was at the predicted means
of RAE's at the specified X values):

Columns: estimates of, medium-easiness item set, low-easiness (hard) item set,
observed F(1,23), critical F(1,24).

[The intercept, coefficient-1 and coefficient-2 cells are illegible in the source
scan.]

               estimates of                             observed F(1,23)

skew group-1   intercept, coefficient-1, coefficient-2   439.5 (X=2)
                                                         1367.1 (X=5)
                                                         1228.1 (X=8)
skew group-2   intercept, coefficient-1, coefficient-2   680.8 (X=2)
                                                         1834.9 (X=5)
                                                         1641.0 (X=8)
skew group-3   intercept, coefficient-1, coefficient-2   417 (X=2)
                                                         1355.2 (X=5)
                                                         1230.9 (X=8)
skew group-4   intercept, coefficient-1, coefficient-2   463.6 (X=2)
                                                         1409.9 (X=5)
                                                         1259.2 (X=8)

Critical F(1,24): 1.4 (75%), 2.9 (90%), 4.3 (95%).



Table 2"-3". Intercepts, slopes, observed and critical F values: case of person- and
item-free ability estimation by the Rasch model across 4 skew levels and 2 easiness levels:
8 dependent variables of Rasch ability estimates (RAE's): (X1 and X2 are the predictors;
X1 = medium subtest scores; X2 = hard subtest scores) (inferential tests were at the
predicted means of RAE's at the specified values of X):

estimates of    skew-1          skew-2          skew-3          skew-4
                med.    hard    med.    hard    med.    hard    med.    hard

intercept       -2.66   -2.72   -2.57   -2.76   -2.69   -2.71   -2.67   -2.76
coefficient-1   .56     -.03    .56     -.03    .57     -.03    .56     -.03
coefficient-2   -.05    .58     -.11    .59     -.05    .58     -.05    .59

Observed F: > 10^[exponent illegible] (X=2) §; > 10^[exponent illegible] (X=5) §;
> 10^[exponent illegible] (X=8) §.
Critical F(7,17): [illegible] (95%).

§ These tests are meaningless since the inordinately high F's are forced by
the perfect correlations among Rasch ability estimates, as in the bottom line of
table 6c.

6a) Correlations among the independent variables: X1 = medium subtest scores;
[remaining definitions illegible]

              X1 and X2   X1 and X   X2 and X

Pearson r     .31         .80        .62


6b) Correlations between the dependent and independent variables: Raschy1 = Rasch
abilities for medium subtest; Raschy2 = Rasch abilities for hard subtest; s1 ... s4 =
skew-group-1 ... skew-group-4:

Pearson r   X1 and     X1 and        X2 and        X2 and
for         Raschy2    Raschy1       Raschy2       Raschy1

s1          .24        .99 (.79)*    .98 (.58)*    .25
s2          .24        .98 (.76)*    .98 (.59)*    .17
s3          .24        .99 (.79)*    .98 (.58)*    .25
s4          .24        .99 (.79)*    .98 (.59)*    .25


6c) Correlations among the dependent variables, the Rasch ability estimates, for
4 skew levels and 2 easiness levels:

Columns: skew-group-1, skew-group-2, skew-group-3, skew-group-4. Rows: Raschy1,
Raschy2.

[Cell positions are not recoverable from the source scan. Legible entries, in scan
order: .[?]96, .996; .18, .11, .18; 1.0, 1.0; 1.0; .18; 1.0.]


6d) Correlations among the dependent variables of Rasch easiness estimates

ability group   skew-set-1     skew-set-2     skew-set-3     skew-set-4

low             1.0            1.0            1.0            [illegible]
                .59            .60            .60            .60
medium          (.08)*  1.0    (.07)*  1.0    (.08)*  1.0    (.08)*  [illegible]
                .49            .49            .49            .49
high            1.0            1.0            1.0            [illegible]

* [beginning of footnote illegible] but within a particular skew-set